The UCREL Semantic Analysis System
Abstract
The UCREL semantic analysis system (USAS) is a software tool for the automatic semantic analysis of spoken and written English. This paper describes the software system and its hierarchical semantic tag set, which contains 21 major discourse fields and 232 fine-grained semantic field tags. We discuss the manually constructed lexical resources on which the system relies, and its seven disambiguation methods, which include part-of-speech tagging, general likelihood ranking, multi-word-expression extraction, domain of discourse identification, and contextual rules. We report an evaluation of the system's accuracy against a manually tagged test corpus, on which the USAS software obtained a precision of 91%. Finally, we refer to applications of the system in corpus linguistics, content analysis, software engineering, and electronic dictionaries.
Similar resources
Comparing the UCREL Semantic Annotation Scheme with Lexicographical Taxonomies
Annotation schemes for semantic field analysis use abstract concepts to classify words and phrases in a given text. The use of such schemes within lexicography is increasing. Indeed, our own UCREL semantic annotation system (USAS) is to form part of a web-based ‘intelligent’ dictionary (Herpiö 2002). As USAS was originally designed to enable automatic content analysis (Wilson and Rayson 1993), ...
Using the UCREL Automated Semantic Analysis System to Investigate Differing Concerns in Refugee Literature
Forced Migration Online (FMO) provides instant access to a wide variety of online resources dealing with forced migrants, and their plight worldwide. The online International Thesaurus of Refugee Terminology (ITRT), in turn, provides users with refugee-related terminology in three languages (English, French and Spanish). The FMO data can already be searched thematically as well as regionally. T...
Developing an automated semantic analysis system for Early Modern English
As reported by Wilson and Rayson (1993) and Rayson and Wilson (1996), the UCREL semantic analysis system (USAS) has been designed to undertake the automatic semantic analysis of present-day English (henceforth PresDE) texts. In this paper, we report on the feasibility of (re)training the USAS system to cope with English from earlier periods, specifically the Early Modern English (henceforth Emo...
Comparing and combining a semantic tagger and a statistical tool for MWE extraction
Automatic extraction of multiword expressions (MWEs) presents a tough challenge for the NLP community and corpus linguistics. Indeed, although numerous knowledge-based symbolic approaches and statistically driven algorithms have been proposed, efficient MWE extraction still remains an unsolved issue. In this paper, we evaluate the Lancaster UCREL Semantic Analysis System (henceforth USAS (Rayso...
The ACAMRIT semantic tagging system
Building on a successful previous project, UCREL (the University Centre for Computer Corpus Research on Language) is collaborating with Reflexions Communication Research (a market research company in London, UK) to develop software which will undertake the semantic tagging of words in a text, facilitate the assignment of 'content tags' to those words, and provide a statistical analysis of the r...